Plant Disease Detection and Classification: A Systematic Literature Review

A significant majority of the population in India makes their living through agriculture. Different illnesses that develop due to changing weather patterns and are caused by pathogenic organisms impact the yields of diverse plant species. The present article analyzed some of the existing techniques in terms of data sources, pre-processing techniques, feature extraction techniques, data augmentation techniques, models utilized for detecting and classifying diseases that affect the plant, how the quality of images was enhanced, how overfitting of the model was reduced, and accuracy. The research papers for this study were selected using various keywords from peer-reviewed publications from various databases published between 2010 and 2022. A total of 182 papers were identified and reviewed for their direct relevance to plant disease detection and classification, of which 75 papers were selected for this review after exclusion based on the title, abstract, conclusion, and full text. Researchers will find this work to be a useful resource in recognizing the potential of various existing techniques through data-driven approaches while identifying plant diseases by enhancing system performance and accuracy.


Introduction
Agricultural biodiversity is foundational to providing food and raw materials to humans. When pathogenic organisms such as fungi, bacteria, and nematodes; the soil pH; temperature extremes; changes in the amount of moisture and humidity in the air; and other factors continuously disrupt a plant, it can develop a disease. Various plant diseases can impact the growth, function, and structures of plants and crops, which automatically affects the people who depend on them. The majority of farmers still identify plant illnesses manually; identifying diseases early in this way is challenging, which has a negative impact on productivity. To overcome this, many deep learning (DL), image processing, and machine learning (ML) techniques are being developed, by which the detection of disease in a plant is performed using images of plant leaves.
Image processing is utilized to improve the quality of images in order to extract valuable information from them; because of this, image processing techniques are utilized in many areas of the medical and agricultural fields, such as color processing, remote sensing, and pattern recognition. Images of plant leaves can be used to identify disease using image processing techniques that are appropriate, effective, and dependable. Image processing involves various stages: image acquisition, image pre-processing, feature extraction, and classification. Using different keywords, 182 papers were extracted, on which inclusion and exclusion operations were performed.

Conduction
This phase focuses on reviewing and summing up the selection criteria for assessing existing models based on ML, image processing techniques, and DL, including CNN, in terms of effective disease detection in different crops and plants using different datasets. In Figure 1, the entire research method utilized to produce this study is shown.
By conducting a keyword search, 182 papers on plant disease detection and classification from sources such as IEEE Xplore, SCOPUS Indexed Journal, and Google Scholar were retrieved that were published in the last 12 years from 2010 to 2022. Three stages made up the exclusion process. The retrieved data were then decreased to 164 based on their titles; publications were then eliminated based on their abstracts and conclusions; and, finally, 75 papers were found after reading the entire text. Figures 2 and 3 represent the number of papers reviewed by year from 2010 to 2022.
For the purpose of writing this systematic review, ten research questions were framed, which are specified in Table 2, and a complete evaluation procedure was conducted by examining the existing models in order to address these research questions. Among them, RQ9 aims to identify the species with which the evaluated studies deal and the classes of diseases identified by the reviewed studies, while RQ10, "What is the accuracy of existing plant disease detection and classification approaches?", aims to identify the accuracy of existing approaches for identifying plant diseases.


Related Work
On the basis of the data obtained from the chosen studies, the research methodology findings in this section provide answers to the research questions listed above. These automated models require significant training time, but once they are trained, they are incredibly accurate at spotting early-stage plant diseases and enabling farmers to take preventative action to lessen the effects of disease on productivity. Figure 4 shows various parameters that were considered for review. The approach that was utilized for conducting this literature review included data acquisition, pre-processing techniques, techniques for augmenting data, techniques for extracting features, different features that were extracted, techniques utilized for identification and classification, how the quality of images was enhanced, and techniques utilized for reducing overfitting of the models. The research questions (RQ1 to RQ10) listed in Table 2 are discussed in Sections 3.1-3.10.

Discussion for RQ1: What Are the Different Data Acquisition Sources?
... leaves from the PlantVillage dataset; both bacterial and fungal illnesses were included in this research. Chakraborty et al. [52] utilized images of 13 plant species and 17 kinds of illnesses, comprising approximately 2600 images, which were obtained from PlantDoc.
Abbas et al. [28] acquired diseased and healthy tomato leaf images from the open-source PlantVillage dataset. Wang et al. [53] acquired 3000 leaf images of various species, both healthy and disease-affected, that were gathered from the PlantVillage dataset. Divakar et al. [29] acquired image data that contained images of both diseased and healthy apple leaves, which were downloaded from a publicly accessible dataset on Kaggle. Chowdhury et al. [30] and Gonzalez-Huitron et al. [54] acquired image data of around 18,100 tomato leaves from the PlantVillage dataset. It was composed of ten classes, of which nine represented various disease-affected leaves and one contained healthy leaves. Akshai et al. [31] acquired images of about 4060 grape leaves, including both healthy and diseased leaves of various categories, from the PlantVillage dataset. Kibriya et al. [32] acquired around 10,000 tomato leaf images from the PlantVillage dataset, out of which 30% were utilized to test the suggested model, whereas 70% were used for training it.
B.V. et al. [33], for the purpose of identifying potato and tomato leaf diseases, utilized a subset of tomato and potato leaf images from the publicly accessible PlantVillage dataset. Table 3 represents a summarized view of the different data acquisition sources utilized by the reviewed studies. Table 4 provides information about the sources in the real environment from which images were gathered by various researchers for their work. Table 3. Summarized view of data acquisition sources.

Table 4. Sources in the real environment from which images were gathered by the reviewed studies.

S. No. | Year and Reference | Real Environment Description
 | | A Germany-based commercial substrate was used to grow sugar beets in order to perform experiments with sugar beet leaves in a greenhouse. Spectral reflectance was measured using a portable non-imaging spectroradiometer, and the SPAD-502 chlorophyll meter was used to determine the amount of chlorophyll.
 | | Under the guidance of an expert, images of grape leaves were shot in Sangali, Pune, and Bijapur using a 16.1 Megapixel Nikon Coolpix P510 digital camera.
5. | 2016 [38] | Under the supervision of an agricultural expert, images of leaves were captured from various farms with a digital camera.
6. | 2016 [9] | Images of cucumber leaves taken with a digital camera were provided by Japan's Research Center.
7. | 2017 [11] | Camera devices were used to capture images of tomato leaves, stems, and fruits in Korea's different farms at the early, medium, and late phases of disease.
10. | 2018 [12] | Images of cucumber leaves were provided by Japan's Research Center.
11. | 2019 [7] | Using the Raspberry Pi Camera V2, pictures of chili stalks were taken at various heights and angles.
12. | 2019 [46] | The Nikon D7200 DSLR was used to take images of guava from several locations in different situations.
14. | 2020 [48] | China's Fujian Institute of Subtropical Botany supplied about 1000 leaf images of maize and rice. The shots were taken in environments with uneven lighting levels and cluttered field backgrounds.
15. | 2020 [21] | Using a Samsung Intelligent LCD camera, images of healthy and diseased leaves were taken.
16. | 2020 [49] | After several visits to farming regions, images of tomato leaves were gathered.
17. | 2020 [50] | Images of rice and maize leaves were captured from agricultural research farms associated with China's Fujian Institute.
18. | 2020 [23] | The 8 MP Samsung A7 smartphone camera was used to take images of lady finger leaves from fields in two villages in the Tiruvannamalai region.
19. | 2021 [24] | Images of soybean leaves were taken from soybean fields in India's Kolhapur region.
20. | 2021 [27] | Images of citrus leaves were captured with a 72 dpi resolution DSLR from a citrus research center in Sargodha City.
Observation 1 This observation is purely framed on the basis of discussions for RQ1 (3.1): 51% of the research under consideration acquired image data from publicly accessible datasets, while 44% employed digital cameras or other devices to collect images from the real environment, and the other 5% obtained their image data from other online sources. The primary publicly accessible datasets used in the evaluated studies were PlantVillage, PlantDoc, and other public datasets. All of the data acquisition sources are depicted in Figure 5.

Discussion for RQ2: What Different Pre-Processing Techniques Are Applied?
For further processing, image data were pre-processed utilizing a number of different techniques. This section involves a discussion on various pre-processing techniques that have been employed by various researchers in their work. By using "Pre-processing techniques" as a filter, 34 papers were identified for this section, of which 26 papers were chosen for analysis.
Sannakki et al. [37] pre-processed images using anisotropic diffusion to produce space-variant and non-linear changes to the original images. Khirade et al. [57] utilized various image pre-processing techniques, including image smoothing, clipping, image enhancement, color conversion, and histogram equalization, to eliminate noise from the images. Rastogi et al. [58], prior to training and testing the proposed model, pre-processed the image data that were gathered during the image acquisition phase by resizing and cropping operations. Es-saddy et al. [38] first downsized the images into a standard size using a resizing operation, and then the noise was eliminated from them using a median filter to enhance their quality. Sladojevic et al. [8] performed two pre-processing operations, including resizing, where the image was scaled into 256 × 256 pixels, and cropping, which was performed to define the regions of interest in plant leaves for improved feature extraction.
Singh et al. [59] carried out several operations during the pre-processing stage to improve the quality of the image, including the clipping operation to extract the relevant image regions and the use of smoothing filters to improve the image's smoothness. In order to increase the image's contrast, image enhancement was used. In the study by Krithika et al. [60], during the pre-processing stage, pixels from grape leaf images' edges were deleted, and RGB data collected during the data acquisition phase were transformed into the HSV and CIELAB color spaces. Ferentinos [61] performed image size reduction and cropping as part of the pre-processing operations that were performed on collected image data to make the images 256 × 256 pixels. Ramesh et al. [4] pre-processed the collected images to make them all the same size. Behera et al. [5] utilized two techniques for pre-processing images. The first technique used was image enhancement, which increased the contrast in the images and drew attention to any hidden details that may have been there, while the other technique used was the CIELAB color space, which shortened the computing time. Francis et al. [62] downsized images to 64 × 64 pixels using the resizing and cropping pre-processing techniques.
Devaraj et al. [63] utilized different MATLAB algorithms throughout the pre-processing step to downsize images, improve the contrast, and transform the RGB images into greyscale. Wahab et al. [7] utilized MATLAB's rgb2gray function in the pre-processing step to convert RGB format images into grayscale while retaining luminance and removing hue and saturation. Howlader et al. [64] used Python code in pre-processing for the purpose of scaling all of the acquired images to 256 × 256 pixels. Sharma et al. [18] performed different pre-processing operations on images to enhance the quality of an image by eliminating noise from it. This was performed by enhancing compactness, changing brightness, extracting noise, and converting to another color space. Sahithya et al. [47] performed a resizing operation to convert all of the images to the same standard size. Jadhav et al. [24] converted images into two different dimensions for AlexNet and GoogleNet. For AlexNet, a total of 649 images of soybean were pre-processed into dimensions of 227 × 227 × 3, whereas 550 images of soybean leaf samples underwent the same pre-processing for the proposed GoogleNet framework, resulting in dimensions of 224 × 224 × 3. Chen et al. [48], for the purpose of creating images of the same size, blackened the shorter sides of images during the pre-processing step. Pham et al. [51], during the pre-processing stage, downscaled the images to a lower resolution, and pixel intensities were adjusted using contrast enhancement. Lijo [25] scaled images to 256 × 256 pixels during the pre-processing stage.
Chowdhury et al. [30] utilized various pre-processing operations. Operations such as resizing and normalization were carried out in the pre-processing step. All collected images were downsized to 224 × 224 for the various EfficientNet approaches, while they were all converted to 256 × 256 for the U-net segmentation techniques. In addition to resizing, the means and standard deviations of the images in the dataset were computed in order to perform z-score normalization. Kibriya et al. [32] utilized two distinct image processing methods, namely, resizing and denoising. The images were denoised using a Gaussian blur filter, and all of the collected data were scaled to 225 × 225. Chouhan et al. [65] pre-processed image data using the resizing, restoration, and image enhancement techniques. Malathy et al. [1] pre-processed image data using image resizing and image restoration, which lessen image noise and improve the image's sharpness.
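Across the reviewed studies, the recurring pre-processing steps are resizing to a fixed resolution, denoising, and normalization of pixel intensities. The following minimal Python sketch (an illustration only, not the code of any reviewed study; OpenCV, the 224 × 224 target size, and per-image z-score normalization are assumptions) shows how such steps are typically chained:

```python
# Hypothetical pre-processing helper: resize, Gaussian-blur denoising, and
# per-image z-score normalization. Values are illustrative, not from the review.
import cv2
import numpy as np

def preprocess_leaf(image_bgr: np.ndarray, size: int = 224) -> np.ndarray:
    resized = cv2.resize(image_bgr, (size, size))      # fixed input size, e.g., 224 x 224
    denoised = cv2.GaussianBlur(resized, (3, 3), 0)    # 3 x 3 Gaussian filter removes noise
    x = denoised.astype(np.float32)
    return (x - x.mean()) / (x.std() + 1e-8)           # z-score normalization (per image)
```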
Jain et al. [55] improved the images' quality by employing a 3 × 3 Gaussian filter to remove noise from the image during pre-processing. Ashwinkumar et al. [66] utilized bilateral filtering, a non-linear filtering technique, during the pre-processing stage to enhance the quality of an image by eradicating noise from the acquired image data. Table 5 shows a summarized view of the various pre-processing methods applied in the different reviewed studies.

Observation 2
This observation is solely based on the RQ2 (3.2) discussions. The studies under evaluation included a variety of pre-processing methods, including scaling, clipping, smoothing, anisotropic diffusion, cropping, denoising, CIELAB color space conversion, contrast improvement, converting RGB images to greyscale, increasing compactness, restoration, and normalization. Figure 6 shows how often the various pre-processing methods were utilized. Resizing was the most common operation, used by 30% of the examined papers to pre-process images, while image enhancement, cropping, and denoising together accounted for another 30% (10% each). Restoration, color conversion, clipping, and smoothing were each used in 4% of the reviewed publications, while each of the other pre-processing approaches was utilized in 3% of the studies.

Discussion for RQ3: What Data Augmentation Techniques Are Utilized?
Various data augmentation approaches can be utilized to enhance the dataset's image count in order to improve accuracy. In this section, the techniques utilized by various researchers in their works to increase the size of the dataset are discussed. Twenty-eight papers that used data augmentation were selected for this section and are considered for analysis.
Sladojevic et al. [8] utilized three different operations, namely, rotations, 3 × 3 transformation matrix-based perspective transformation, and affine transformations for augmenting images. Fujita et al. [9] utilized three different augmentation techniques-image shifting, mirroring, and image rotation-to expand the dataset. Dyrmann et al. [39] utilized rotation and mirroring techniques to expand the training dataset to 50,864 images (eight times the number of original images). Fuentes et al. [11] increased the image count in the training dataset through the use of the two image augmentation approaches, namely, geometrical transformation and intensity transformations. While procedures including image scaling, cropping, rotation, and horizontal flipping were carried out during geometrical transformation, intensity transformation dealt with noise, color, brightness enhancement, and contrast. Ma et al. [13] utilized the rotation and flipping operations to increase the amount of image data. Images in the dataset were rotated by 90, 180, and 270 degrees during the rotation process, but during the flipping operation, images were flipped in both the horizontal and vertical directions. Cap et al. [12] increased the dataset's image count utilizing cropping (from the center) and rotation (clockwise) operations.
Kobayashi et al. [67] utilized several augmentation techniques, including rotation, shear conversion, cutout, and horizontal and vertical direction, to expand the size of the dataset in order to improve detection accuracy. Geetharamani et al. [16] utilized augmentation operations such as flipping, principal component analysis, rotation, scaling, noise injection, and gamma correction to expand the dataset's size to approximately 61,400 images. Zhang et al. [17] utilized intensity transformations and geometric transformations to increase the number of images. There were five approaches used for the intensity transformations: contrast enhancement, color jittering, PCA jittering, blur (radial), and brightness enhancement. Images were enlarged, cropped, rotated, and flipped in geometric transformations (horizontally and vertically). Adedoja et al. [15] utilized different combinations of data augmentation techniques, including RandomRotate, RandomFlip, and RandomLighting, which added to images so that they could be evaluated from various perspectives. KC et al. [45] augmented the image data using cropping, flipping, shifting, rotating, and combining these techniques.
Haque et al. [46] applied several augmentation methods, including flipping (horizontal flip), zooming, shifting (height and breadth), rotating, nearest fill, and shearing, to lessen the overfitting of the guava images in the dataset. Coulibaly et al. [19] utilized four different operations by which images were augmented, namely, rescale, flipping, shift, and zoom. Ji et al. [20] increased the number of images of grape leaves with the aid of various data augmentation techniques, including rotation, zooming, flipping, shearing, and color changing. Chen et al. [48] utilized rotation, flip, scaling, and translation operations to increase the amount of image data in the utilized dataset. Kannan E et al. [68] utilized two different operations to increase the size of the dataset. Using RandomResizedCrop, where images were cropped to a random scale between 0.08 and 1 of the original size; RandomRotation by 30 degrees; and both of these techniques together, the dataset was increased fourfold.
Marzougui et al. [21] utilized the "Keras Image Data Generator" class, and operations such as flip, rotation, and shift were carried out to increase the amount of image data. Images were rotated by 30 degrees and flipped horizontally, the fill mode was set to nearest, and shift operations were carried out both vertically and horizontally for better results. Selvam et al. [23] performed five different augmentation operations, namely, rotation, flipping (horizontally), shear, zoom, and shift (height, width), to increase the count of images of lady's finger leaves. Lijo [25] utilized rotation, contrast enhancement, brightness enhancement, and noise reduction to increase the amount of image data. Divakar et al. [29] utilized the synthetic minority oversampling technique (SMOTE) to increase the count of images in the dataset in a balanced manner. Chowdhury et al. [30] performed three affine transformation operations-scaling, rotation (clockwise and anticlockwise), and translation (5% to 20% vertically and horizontally)-for the purpose of increasing image data.
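As a concrete illustration of the Keras-based augmentation described above, the short sketch below configures an ImageDataGenerator with 30-degree rotations, horizontal flips, nearest fill, and horizontal/vertical shifts; the shift ranges, directory name, image size, and batch size are assumed placeholders rather than values reported in the reviewed studies:

```python
# Hypothetical augmentation setup mirroring the operations described above.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=30,       # rotate by up to 30 degrees
    horizontal_flip=True,    # flip horizontally
    width_shift_range=0.1,   # horizontal shift (assumed fraction of width)
    height_shift_range=0.1,  # vertical shift (assumed fraction of height)
    fill_mode="nearest",     # fill newly exposed pixels with the nearest value
)

# Streams augmented batches from a folder of leaf images (placeholder path).
train_batches = augmenter.flow_from_directory(
    "data/train", target_size=(224, 224), batch_size=32, class_mode="categorical"
)
```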
Akshai et al. [31], to enhance the size of the dataset while reducing overfitting, utilized different augmentation techniques, such as rotation, shifting, and zooming. Gonzalez-Huitron et al. [54] performed horizontal flipping and four-angle rotation throughout the augmentation process. B.V. et al. [33] utilized the flip operation for the purpose of increasing the count of images in the dataset. Chelleapandi et al. [69] carried out five different data augmentation operations, including rotation, filling, flipping, zooming, and shearing, using the Keras library to enhance the dataset. Pandian et al. [34] utilized neural style transfer, position and color augmentation, deep convolutional generative adversarial network, and principal component analysis to increase the number of images from 55,448 to 234,008. Vallabhajosyula et al. [56] performed four different augmentation approaches-scaling, translation, rotation, and image enhancement-to increase the size of the dataset and to reduce overfitting. Table 6 shows a summarized view of the various data augmentation techniques utilized in different evaluated studies. Table 6. Summarized view of different data augmentation techniques.

Observation 3
This observation is purely based on the discussion of RQ3 (3.3). Numerous data augmentation methods, such as rotation, mirroring, cropping, flipping, PCA (color augmentation), zooming, shifting, scaling, RandomRotate, translation, etc., were used in the reviewed studies for increasing the dataset's image count. The overall utilization of the various augmentation techniques used in the reviewed studies is shown in Figure 7. From the figure, it is clearly evident that rotation was utilized much more frequently for increasing the dataset's image count than other methods (21%), while flipping came in second, at 10%. Mirroring was employed for augmentation in 8% of the investigations, while zooming and shearing were each utilized in 7% of the research. Additionally, 2% of the evaluated studies utilized affine transformation, mirroring, geometrical transformation, intensity transformation, cropping, PCA, RandomRotate, and translation, whereas only 1% of the other 20 techniques were used individually.


Discussion for RQ4: What Kinds of Feature Extraction Methods Are Employed?
This section involves a discussion of various feature extraction methods that were utilized by various researchers in their works. Twenty-six papers were found by using feature extraction techniques for filtering, of which nineteen papers are utilized for analysis in this section.
Husin et al. [36] took color space into account; using it, the illumination in the images could be reduced, allowing for an effective determination of whether or not a leaf is from a chili plant. From the images, information pertaining to color matching, color identification, and color was extracted. Dubey et al. [3] utilized the color coherence vector, global color histogram, complete local binary pattern, and local binary pattern methods for retrieving/extracting features. Sannakki et al. [37] utilized the color co-occurrence method for extracting texture features. Rastogi et al. [58] utilized the gray level co-occurrence matrix (GLCM) for extracting features. Es-saddy et al. [38] extracted three distinct categories of features, namely, shape, color, and texture. While textural features were retrieved using GLCM, color features were extracted using the color histogram, color structure descriptor, and color moments (skewness, mean, and standard deviation). The complexity, area, circularity, and perimeter were used as shape features. Singh et al. [59] extracted features using the color co-occurrence approach. Krithika et al. [60] extracted texture features by utilizing GLCM. Ramesh et al. [4] utilized the histogram of oriented gradients (HOG) as a feature extraction method for creating feature vectors.
Behera et al. [5] utilized GLCM for extracting textural features. Tulshan et al. [6] and Devaraj et al. [63], using GLCM, retrieved the relevant features needed for classification. Kumari et al. [70] extracted features from the segmented cluster that contained the leaf segment afflicted by the disease after converting the images to greyscale. Color and texture (extracted using GLCM), two distinct types of features, were retrieved from images in the works by Wahab et al. [7] and Sahithya et al. [47]. Chen et al. [71] utilized RESNET18 (CNN) and a task-adaptive procedure for extracting features. Chouhan et al. [65] utilized scale-invariant feature transform for extracting features. Jain et al. [55] extracted two main kinds of information from the images: texture features and color features. To extract color features, the skewness, standard deviation, kurtosis, and mean of the color moment equation were used. With the use of a GLCM, the second class of features was extracted. Pandian et al. [34] utilized several optimal convolutional layers for extracting features. Ashwinkumar et al. [66] utilized the MobileNet model, which is based on CNNs, for extracting the necessary information from the images. Table 7 shows a summarized view of the various feature extraction techniques applied in different reviewed studies.

Observation 4
This observation is purely based on the discussion for RQ4 (3.4). The evaluated studies utilized various approaches for extracting features, namely, GLCM, HOG, color co-occurrence, global color histogram, etc. Figure 8 makes it clear that 43% of the examined research used GLCM to extract features, with the color coherence vector coming in second place, with 14% of the total. Additionally, only 4% of the examined research used the global color histogram and local binary pattern, whereas 5% of the studies used the complete local binary pattern, color histogram, color structure descriptor, color moments, HOG, RESNET18, and task-adaptive process. The overall utilization of the different feature extraction techniques in the reviewed studies is shown in Figure 8.
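Since GLCM was by far the most widely used feature extractor among the reviewed studies, a minimal sketch of GLCM-based texture features is given below; it assumes scikit-image and an 8-bit greyscale leaf image, and the distances and angles are illustrative rather than taken from any specific paper:

```python
# Hypothetical GLCM texture-feature extraction with scikit-image.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(gray_image: np.ndarray) -> dict:
    """Compute common GLCM texture features from an 8-bit greyscale image."""
    glcm = graycomatrix(
        gray_image,
        distances=[1],                                    # pixel-pair offset
        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],  # 0, 45, 90, 135 degrees
        levels=256,
        symmetric=True,
        normed=True,
    )
    # Average each property over the four angles.
    return {
        prop: float(graycoprops(glcm, prop).mean())
        for prop in ("contrast", "correlation", "energy", "homogeneity")
    }
```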

Discussion for RQ5: What Are the Typical Attributes That Are Used or Extracted?
This section contains information about different features that were extracted during the feature extraction process. Using extracted features to filter the pool of publications, 20 papers were found for this section, out of which 14 were used for analysis and are represented in Table 8.

Observation 5
This observation is purely based on the discussion for RQ5 (3.5). The reviewed studies extracted various features during the feature extraction stage, namely, color features, shape features, correlation, texture features, energy, variance, mean, geometrical features, and standard deviation. Figure 9 shows the utilization, in percentages, of the different extracted features used in the evaluated studies. The chart shows that 32% of the evaluated studies extracted texture features, 17% color features, and 12% shape features during the feature extraction stage, indicating that the majority of the examined studies extracted texture features. In addition, 6% of the reviewed research extracted each of the correlation, homogeneity, energy, and contrast features, while 3% of the studies retrieved feature vectors, variance, mean, geometrical features, and standard deviation.

Discussion for RQ6: What Techniques Are Utilized for the Detection and Classification of Plant Diseases?
This section involves a discussion of the different machine learning- and deep learning-based approaches that were utilized by various researchers in their works for the identification and classification of diseases. By sifting through the pool of publications using existing automated algorithms created for identifying and categorizing plant diseases as a filter, 45 publications were found to be relevant to this subject. For analysis, 37 papers were considered.
Rumpf et al. [2] utilized an SVM on hyperspectral data for the purpose of identifying illnesses from sugar beet plant leaves, such as powdery mildew, sugar beet rust, and cercospora leaf spot. Wang et al. [35] employed backpropagation networks (BPNN) to recognize two distinct diseases in grape leaves and two different types of diseases in wheat. To identify disease in chili leaves, image processing techniques were utilized by Husin et al. [36]. Dubey et al. [3] employed multi-class SVM for recognizing and categorizing three diseases, namely, apple rot, apple blotch, and apple scab, which affect apples. Mahlein et al. [72] examined the leaves of sugar beet plants to identify three different plant illnesses using spectral disease indices. Sannakki et al. [37] utilized a feed-forward back propagation neural network (BPNN) for identifying powdery mildew and downy mildew from grape leaves. Es-saddy et al. [38] employed a serial combination of two support vector machines for identifying different types of damage to leaves by Tuta absoluta, leaf miners, and thrips (pest insects), along with late blight and powdery mildew (pathogen symptoms). Fujita et al. [9] used a CNN to identify seven distinct illnesses from cucumber leaves. Durmus et al. [41] employed two types of deep learning models, namely, SqueezeNet and AlexNet, for detecting illnesses, including leaf mold, bacterial spot, early blight, septoria leaf spot, mosaic virus, target spot, late blight, yellow leaf curl, and spider mites, from tomato leaves. Brahimi et al. [10] utilized CNN to identify nine distinct diseases from tomato leaves.
Liu et al. [42] employed AlexNet's deep CNN to recognize mosaic, rust, alternaria leaf spot, and brown spot in apples. Ferentinos [61] employed DL-based CNN models for identifying diseases in 25 distinct plant species. Ramesh et al. [4] utilized the random forest as a classifier in order to detect diseases in papaya leaves. Ma et al. [13] utilized deep CNN to identify downy mildew, anthracnose, powdery mildew, and target leaf spots from cucumber leaves. Sardogan et al. [14] employed a CNN model based on learning vector quantization (LVQ) to detect and categorize four distinct illnesses in tomato leaves. Behera et al. [5] utilized SVM with K-means clustering to identify four different diseases in oranges (brown rot, citrus canker, stubborn, and melanoses), while fuzzy logic was used to determine the severity of each disease. Geetharamani et al. [16] deployed a nine-layer deep CNN to detect illnesses in 13 different plant species. Francis et al. [62] employed a CNN for the purpose of identifying illness from leaves of the tomato and apple species. Kumari et al. [70] applied image processing techniques and neural networks for the purpose of identifying illnesses in cotton and tomato leaves.
Zhang et al. [17] utilized a CNN with GAP (global average pooling) for detecting several diseases from cucumber leaves, such as gray mold, anthracnose, powdery mildew, downy mildew, black spot, and angular leaf spot. Wahab et al. [7] employed SVM (support vector machine) to locate the cucumber mosaic virus in the chili leaf plant. Adedoja et al. [15] employed NASNet for the identification of diseases. Howlader et al. [64] utilized deep CNN to detect several illnesses, including algal leaf spots, rust, and whitefly, from guava leaves, while Haque et al. [46] employed a convolutional neural network for detecting fruit rot, anthracnose, and fruit canker from the same species. Sahithya et al. [47] utilized a support vector machine (SVM) and an artificial neural network (ANN) for detecting three different illnesses from lady finger leaves, including powdery mildew, leaf spots, and yellow mosaic vein.
Coulibaly et al. [19] utilized transfer learning with feature extraction to identify mildew in pearl millet. Ji et al. [20] employed UnitedModel (CNN) for the purpose of diagnosing three diseases from grape leaves, namely isariopsis, black rot, and esca. Jadhav et al. [24] employed two types of CNNs, AlexNet and GoogleNet, for the purpose of detecting three different illnesses, namely, frogeye leaf spot, brown spot, and bacterial blight, from soybean leaves. Sun et al. [26] applied the DM (discount momentum) deep learning optimizer for the purpose of identifying diseases of 26 distinct classes from 14 different crops. Shrestha et al. [22] utilized a CNN for the detection of different diseases from three different species (potato, tomato, and bell pepper), including early blight and late blight from potato, bell pepper bacterial spot, tomato target spot, tomato mosaic virus, tomato yellow leaf curl virus, tomato bacterial spot, late blight and early blight from tomato, tomato leaf mold, tomato spider mites, and tomato septoria leaf spot. Bedi et al. [73] employed a hybrid technique based on CNN and convolutional autoencoders for the purpose of identifying bacterial spot disease from peach leaves.
Abbas et al. [28] applied DenseNet to synthetic images produced by the Conditional Generative Adversarial Network in order to detect various diseases from tomato leaves. Chen et al. [71] utilized LFM-CNAPS (local feature matching conditional neural adaptive processes), which was developed on the basis of meta-learning, for the detection of 60 distinct diseases from 26 different plants. Akshai et al. [31] employed three distinct models based on convolutional neural networks, namely, VGG, DenseNet, and ResNet, for detecting the black rot, leaf blight, and esca diseases from grape leaves. Kibriya et al. [32] utilized GoogleNet and VGG16 for the purpose of identifying three different diseases in tomato leaves. Sujatha et al. [27] utilized three ML and three DL approaches to classify diseases. Ashwinkumar et al. [66] employed an optimal mobile network-based CNN for identifying leaf mold, early blight, target spot, and late blight from the leaves of tomatoes. Table 9 shows a summarized view of the different classification techniques utilized by various researchers for classifying plant diseases.

Observation 6
This observation is solely based on the RQ6 discussion (3.6). As shown in Figure 10, the evaluated studies utilized numerous approaches for classifying plant diseases, including SVM, BPNN, multi-class SVM, SqueezeNet, AlexNet, ANN, VGG-19, ResNet, DenseNet, and others. The figure illustrates that while SVM was employed in five studies for identifying plant diseases, eight of the evaluated publications utilized convolutional neural networks. Secondly, three of the studies that were examined used DCNN and AlexNet. Third, the VGG16 model, DenseNet, and GoogleNet were each utilized in two reviewed studies. Additionally, all other diagnosis methods, such as backpropagation, multi-class SVM, feed-forward BPNN, two SVMs, SqueezeNet, CNN with GAP, CNN based on LVQ, NasNet, DM optimizer, autoencoders, VGG, ResNet, inception-v3, and optimal mobile network-based CNN, were each used once by the evaluated studies.
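For the classical pipeline reported in several of these studies (hand-crafted features such as GLCM statistics fed to an SVM), a minimal sketch is shown below; scikit-learn, the RBF kernel, and the 70/30 train/test split are assumptions for illustration rather than the exact setup of any reviewed work:

```python
# Hypothetical feature-based SVM classifier for leaf disease classes.
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

def train_svm_classifier(X, y):
    """X: one feature vector per leaf image; y: disease class labels."""
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))  # assumed hyperparameters
    clf.fit(X_train, y_train)
    return clf, accuracy_score(y_test, clf.predict(X_test))
```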

Discussion for RQ7: What Analytical Techniques Are Employed for Improving Image Quality?
This section involves discussion on the techniques that were utilized by various researchers for improving the quality of the images. By filtering them using methods for improving image quality, 16 publications were found for this section. Eleven papers were finally considered for the study.
Wang et al. [35] denoised images of wheat and grape leaves with diseased symptoms using a median filter to improve the image quality. Thangadurai et al. [74] utilized two techniques for image enhancement, namely, color conversion and histogram equalization, to improve the quality of the images. Images from RGB sources were changed to greyscale using color conversion. The images became clearer after histogram equalization. Khirade et al. [57] enhanced the image quality by histogram equalization. Es-saddy et al. [38] and Singh et al. [59] increased the quality of the images during the pre-processing stage. Krithika et al. [60] utilized the following formula during the pre-processing step to improve the greyscale images: Sk,l = (Tk,l − min(T))/(max(T) − min(T)), where T is the original pixel value, S is the new pixel value, and (k,l) are the indices of the pixels.
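A minimal sketch of these enhancement operations (median filtering, histogram equalization, and the min-max stretch defined above) is given below; OpenCV and the 5 × 5 median kernel are assumptions rather than details reported by the reviewed studies:

```python
# Hypothetical greyscale enhancement pipeline for leaf images.
import cv2
import numpy as np

def enhance_gray_leaf(image_bgr: np.ndarray) -> np.ndarray:
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)  # color conversion to greyscale
    denoised = cv2.medianBlur(gray, 5)                  # median filter suppresses noise
    equalized = cv2.equalizeHist(denoised)              # histogram equalization improves clarity
    # Min-max contrast stretch: S = (T - min(T)) / (max(T) - min(T))
    t = equalized.astype(np.float32)
    stretched = (t - t.min()) / (t.max() - t.min() + 1e-8)
    return (stretched * 255).astype(np.uint8)
```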
Tulshan et al. [6] enhanced the quality of the images that were taken from the dataset and utilized as inputs during the pre-processing stage. Malathy et al. [1] performed pre-processing following image data collection to increase the image quality. Cap et al. [75] utilized LeafGAN, which significantly improved the quality of the images generated during the data augmentation stage and, in turn, the proposed model's overall performance. Vallabhajosyula et al. [56] focused on the images' brightness and contrast during the pre-processing stage, which boosted their quality. Ashwinkumar et al. [66] employed a bilateral filter to enhance the quality of the images by removing noise from them: in the pre-processing stage, plant leaf images were used as the input to a bilateral filter, which improved the image quality by eliminating noise. Table 10 represents a summarized view of the various techniques applied in the different studies for improving the quality of the images.

Observation 7
This observation is purely based on the discussion for RQ7 (3.7). In the evaluated studies, the quality of the images was improved using various filters, namely, bilateral and median filters, by histogram equalization during pre-processing, and by color conversion. Figure 11, which shows the percentages of the techniques used for enhancing image quality, was created on the basis of the information in Table 10.


Discussion for RQ8: What Are the Techniques Utilized for Reducing/Removing Overfitting?
When there is a significant discrepancy in the accuracy values which a model produces for training and testing datasets, it is said to overfit. In this section, the different techniques which were employed by different authors to reduce overfitting are discussed. Twenty-four publications were found for this section after filtering them using techniques to reduce overfitting. Finally, 21 papers were taken into account for analysis.

Sladojevic et al. [8] introduced some distortion to the images during augmentation to prevent overfitting. Fujita et al. [9] utilized rotation and flipping operations for data augmentation to lessen overfitting. Durmus et al. [41] applied activation function layers to increase the model's non-linearity, while dropout layers and pooling layers were used to lessen overfitting. Fuentes et al. [11] performed extensive data augmentation in order to prevent overfitting. Liu et al. [42], by using image processing approaches (expanding the training dataset's image count), response-normalizing layers (which enabled local normalization), and swapping out some fully connected layers for convolution layers, lessened the overfitting of the model. Ma et al. [13] minimized overfitting by expanding the dataset of cucumber leaf images using data augmentation techniques. Geetharamani et al. [16] introduced distorted images to the training dataset during image transformation to avoid overfitting. Francis et al. [62] avoided overfitting by setting the dropout value at 0.25. Ji et al. [20] employed several approaches, including an early stop mechanism, data augmentation techniques, and dropout, to minimize the overfitting of the model. Howlader et al. [64] mitigated overfitting by using the ReLU activation function and data augmentation approaches. The ReLU activation function is given as F(N) = max(0, N), where N is the input to a neuron.
Coulibaly et al. [19] utilized the concept of an early stopping strategy to reduce overfitting. Lijo [25], Abbas et al. [28], Pandian et al. [34], Chen et al. [48], Vallabhajosyula et al. [56], and Kannan E et al. [68] reduced the overfitting of the model using data augmentation techniques. Bedi et al. [73] employed the concept of early halting, and the patience value was set to 5 to prevent model overfitting. Wang et al. [53] utilized 1 × 1 convolution to decrease overfitting. Chen et al. [71] utilized forward propagation in order to avoid overfitting. Chowdhury et al. [30] utilized GAP for the purpose of reducing overfitting. Table 11 shows a summarized view of the various techniques applied in different research for reducing or removing overfitting. Table 11. Summary of ways for reducing or removing overfitting.
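To make these countermeasures concrete, the following minimal Keras sketch combines three of them: dropout (with a rate of 0.25, as in Francis et al. [62]), early stopping with a patience of 5 (as in Bedi et al. [73]), and global average pooling; the network architecture, input size, and class count are illustrative assumptions, not the configuration of any single reviewed study:

```python
# Hypothetical small CNN showing dropout, GAP, and early stopping.
from tensorflow.keras import layers, models, callbacks

model = models.Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.GlobalAveragePooling2D(),          # GAP instead of large dense layers
    layers.Dropout(0.25),                     # dropout rate of 0.25
    layers.Dense(10, activation="softmax"),   # placeholder number of disease classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

early_stop = callbacks.EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True)
# model.fit(train_batches, validation_data=val_batches, epochs=50, callbacks=[early_stop])
```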

Observation 8
This observation is purely based on the discussion for RQ8 (3.8). In the evaluated studies, overfitting was reduced by various approaches, such as adding distortion to images, data augmentation, global average pooling (GAP), response-normalizing layers, pooling layers, the early stop mechanism, etc. Figure 12 shows the various approaches utilized for reducing the overfitting of the models; in many of the evaluated studies, overfitting was reduced by data augmentation.

Discussion for RQ9: What Are the Different Plant Species That the Evaluated Research is Based on, and What Classes of Diseases Have Been Found by the Evaluated Studies?
This discussion focuses on the disease classes that were discovered in the specific plant species on which the reviewed studies were based.
The model used by Rumpf et al. [2] and Mahlein et al. [72] diagnosed three diseases, namely, sugar beet rust, Cercospora leaf spot, and powdery mildew, from the leaves of sugar beet plants. Wang et al. [35] suggested an approach which detected diseases from two species, namely, grapes and wheat. Downy and powdery mildew were detected in grapes, whereas leaf rust and stripe rust were found in wheat. Dubey et al. [3] suggested an approach which diagnosed apple rot, apple scab, and apple blotch from images of apples. The proposed model of Sannakki et al. [37] detected two distinct classes of mildew, namely, powdery and downy, from images of grape leaves. Es-saady et al. [38] diagnosed diseases caused by pest insects (thrips, leaf miners, Tuta absoluta) and pathogens (early and late blight, powdery mildew) from leaf images. Fujita et al. [9] proposed a model which identified a total of seven distinct classes of diseases from cucumber leaf images, of which four classes were caused by mosaic viruses, including zucchini yellow, cucumber mosaic virus, watermelon mosaic virus, and kyuri green mottle mosaic virus. Three classes were caused by other viruses, including melon yellow spot virus, cucurbit chlorotic yellows virus, and papaya ring spot virus.
Durmus et al. [41] and Brahimi et al. [10] proposed a model which was utilized for identifying nine classes of diseases, namely, leaf mold, early and late blight, yellow leaf curl virus, bacterial spot, septoria leaf spot, mosaic virus, target spot, and spider mites, from images of tomato leaves. Liu et al. [42], using the AlexNet-based model, diagnosed four distinct classes of diseases, namely, rust, alternaria leaf spot, mosaic, and brown spot, from the leaves of apples. Ferentinos [61] suggested an approach for the purpose of recognizing 58 kinds of diseases from leaf images of 25 different plant species. Ramesh et al. [4] presented a method for identifying healthy and unhealthy papaya leaves. Ma et al. [13] proposed a model which diagnosed four distinct categories of cucumber diseases, namely, target leaf spots, downy and powdery mildew, and anthracnose, from cucumber leaf images. Sardogan et al. [14] presented a model for identifying four groups of diseases from images of tomato leaves, including septoria spot, bacterial spot, yellow leaf curl, and late blight. Behera et al. [5] proposed an approach for detecting brown rot, citrus canker, melanoses, and stubbornness from images of oranges. Geetharamani et al. [16] suggested a technique for identifying 38 classes from images of the leaves of 13 distinct plant species. Francis et al. [62] suggested an approach for categorizing healthy and diseased leaves of two species, namely, apple and tomato.
Ji et al. [20] presented a UnitedModel for diagnosing three classes of grape diseases, namely, isariopsis leaf spot, esca, and black rot, from images of grape leaves. Kumari et al. [70] proposed a model for diagnosing diseases from images of cotton and tomato leaves. It identified two classes of cotton diseases, target spot and bacterial leaf spot, and two classes of tomato diseases, namely, leaf mold and septoria leaf spot. Zhang et al. [17] suggested an approach which detected six distinct classes of cucumber diseases. The proposed model of Wahab et al. [7] identified cucumber mosaic virus from images of leaves of the chili plant. Haque et al. [46] presented a methodology for diagnosing fruit rot, anthracnose, and fruit canker from images of guava, whereas Howlader et al. [64] proposed an approach for identifying rust, algal leaf spot, and whitefly from images of guava leaves. Sahithya et al. [47] diagnosed three distinct classes of diseases, namely, powdery mildew, leaf spot, and yellow mosaic vein, from images of lady finger leaves. Coulibaly et al. [19] presented an approach for the purpose of identifying mildew in pearl millet. Jadhav et al. [24] proposed a methodology which identified brown spots, bacterial blight, and frogeye leaf spots from images of soybean leaves.
Kannan E et al. [68] diagnosed diseases, namely, yellow leaf curl, septoria leaf spot, early blight, mosaic virus, and bacterial spot, from tomato leaf images. Sun et al. [26] proposed a model which identified 26 classes of disease from the leaves of 14 plant species. Pham et al. [51] identified three types of diseases from images of mango leaves, including powdery mildew, gall midge, and anthracnose. Shrestha et al. [22] proposed a model that diagnosed twelve classes of disease from the leaves of three species, namely, potato (two classes), tomato (nine classes), and bell pepper (one class). The 12 classes of diseases included late and early blight (potato); bell pepper bacterial spot; and, in tomato plants, yellow leaf curl virus, target spot, mosaic virus, septoria leaf spot, early blight, bacterial spot, leaf mold, late blight, and spider mites. Bedi et al. [73] presented a model which detected bacterial spots from images of peach leaves. Vallabhajosyula et al. [56] proposed a deep ensemble neural network to diagnose 38 classes of diseases from 14 plant species. Abbas et al. [28] suggested a methodology for identifying nine classes of diseases from images of tomato leaves, namely, yellow leaf curl virus, bacterial spot, septoria leaf spot, two-spotted spider mite, target spot, early blight, leaf mold, late blight, and mosaic virus. Chen et al. [71] suggested an approach which diagnosed 60 distinct classes of diseases from the leaf images of 26 plant species.
Akshai et al. [31] proposed CNN-based models, which they utilized for diagnosing black rot, leaf blight, and esca from images of grape leaves, which were acquired from the plant village dataset. Malathy et al.
[1] proposed a CNN for diagnosing diseases, namely, bitter rot, powdery mildew, and sooty blotch, from images of apples. Kibriya et al. [32] diagnosed early and late blight and bacterial spot, whereas Ashwinkumar et al. [66] proposed a model for identifying leaf mold, early and late blight, and target spot from images of tomato leaves. Table 12 shows a summarized view of the plant species and classes of diseases detected and classified by the reviewed studies.

Observation 9
This observation is purely framed on the basis of the discussion for RQ9 (3.9). As per Figure 13, it is evident that the evaluated studies mostly worked on classifying tomato diseases (13 evaluated studies). Secondly, the number of evaluated studies classifying diseases for the apple and grape species was equal, at seven. Four evaluated studies were conducted to classify diseases in cucumber, orange, peach, pepper, potato, and soybean, whereas two reviewed studies included guava and sugar beet. Lastly, diseases in chili, papaya, cotton, wheat, pearl millet, etc., were diagnosed by one reviewed study each. Figure 13 depicts the various species for which diagnoses were made by the evaluated studies.

This section focuses on the accuracy of the existing approaches that were proposed by the evaluated studies.
For the diagnosis of several diseases in sugar beet, Rumpf et al. [2] proposed an SVM model based on hyperspectral reflectance which offered accuracy levels greater than 86%. Wang et al. [35] used a model for predicting two different grape diseases with an accuracy of 97.14%, while two types of wheat diseases were detected with 100% accuracy using BPNN and image processing technologies. Dubey et al. [3] proposed an approach that attained 93% accuracy in identifying various diseases in apples, namely, apple rot, apple scab, and apple blotch. Mahlein et al. [72] proposed a model for detecting sugar beet diseases which achieved accuracy rates for sugar beet rust, powdery mildew, and Cercospora leaf spot of 87%, 85%, and 92%, respectively. Sannakki et al. [37] proposed a model which achieved 100% accuracy in identifying two different grape illnesses by utilizing the hue feature. Es-saddy et al. [38] proposed a model that attained an accuracy of 87.80%. The proposed CNN model of Fujita et al. [9] attained 82.3% accuracy for detecting various cucumber diseases. Durmus et al. [41] used AlexNet and SqueezeNet, two DL-based models, which attained 95.65% and 94.3% accuracy, respectively, whereas a 99.18% accuracy rate was attained by Brahimi et al. [10], who used a CNN for identifying the same illnesses from tomato leaves.
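For readers unfamiliar with such feature-based classifiers, the sketch below shows a minimal SVM pipeline of the kind used in several of the classical studies above; the feature matrix, class labels, and hyperparameters are purely illustrative assumptions, not values taken from any reviewed paper.

```python
# Minimal sketch of a feature-based SVM classifier, in the spirit of the
# classical ML pipelines above (e.g., SVM on reflectance/texture features).
# X would normally hold one pre-extracted feature vector per leaf image;
# the data generated here is synthetic and purely illustrative.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))        # 300 samples, 20 features (placeholder)
y = rng.integers(0, 3, size=300)      # 3 hypothetical disease classes

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

scaler = StandardScaler().fit(X_train)
clf = SVC(kernel="rbf", C=10, gamma="scale")
clf.fit(scaler.transform(X_train), y_train)

pred = clf.predict(scaler.transform(X_test))
print(f"test accuracy: {accuracy_score(y_test, pred):.3f}")
```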
Liu et al. [42] used an AlexNet-based model which attained 97.62% accuracy for identifying different apple diseases. Ferentinos [61] proposed a CNN model that obtained an accuracy of 99.53% for detecting 58 classes of diseases. Ramesh et al. [4] proposed a random forest classifier that provided an accuracy of 70% for detecting healthy and unhealthy papaya leaves. Ma et al. [13] attained an accuracy of 93.4% using the proposed deep CNN to identify various kinds of cucumber leaf diseases. Sardogan et al. [14] achieved 86% accuracy in the detection of septoria spot, bacterial spot, yellow curved, and late blight from tomato leaves by utilizing a CNN and learning vector quantization. The SVM with k-means clustering proposed by Behera et al. [5] obtained an accuracy of 90% for detecting orange diseases. Geetharamani et al. [16] attained an accuracy of 96.46% for identifying diseases from the leaves of 13 different species of plants. The model proposed by Wahab et al. [7] provided an accuracy of 57.1% for detecting cucumber mosaic disease on chili leaves. Francis et al. [62] suggested a CNN model which achieved 87% accuracy in the identification of diseases in apple and tomato leaf species. Ji et al. [20] proposed a UnitedModel, which attained a test accuracy of 98.57%.
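Most of the deep learning results above rest on CNN classifiers trained end-to-end on leaf images. The following is a minimal Keras sketch of such a classifier; the image size, class count, and dataset directory are illustrative assumptions and do not reproduce the architecture of any specific reviewed study.

```python
# Minimal sketch of a from-scratch CNN leaf-disease classifier (TensorFlow/Keras).
# Image size, class count, and the dataset path are illustrative assumptions.
import tensorflow as tf

NUM_CLASSES = 4                     # e.g., four leaf-disease classes (assumed)
IMG_SIZE = (128, 128)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=IMG_SIZE + (3,)),
    tf.keras.layers.Rescaling(1.0 / 255),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Hypothetical directory of labeled leaf images, one sub-folder per class:
# train_ds = tf.keras.utils.image_dataset_from_directory(
#     "leaf_images/train", image_size=IMG_SIZE, batch_size=32)
# model.fit(train_ds, epochs=15)
```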
The proposed approach of Kumari et al. [70] attained 90% accuracy in identifying bacterial leaf spots and 80% in diagnosing target spots from cotton leaves, whereas it provided an accuracy of 100% in identifying two distinct classes of tomato diseases from tomato leaves. Zhang et al. [17] proposed a model which attained 94.65% accuracy for detecting downy mildew, anthracnose, black spot, powdery mildew, angular leaf spot, and gray mold diseases from cucumber leaves using a convolutional neural network with global average pooling. Haque et al. [46] proposed a CNN which achieved an accuracy of 95.61% for diagnosing diseases from guava, whereas the deep CNN model suggested by Howlader et al. [64] attained an accuracy of 98.74% in identifying rust, algal leaf spot, and whitefly from guava leaves. Sahithya et al. [47] proposed SVM and ANN models for identifying various diseases from lady finger leaves, whose performance varied when tested on datasets with and without noisy images. The SVM provided an accuracy of 85% when noise was present in the images and 92% for images without noise, whereas the ANN provided 97% accuracy when noise was present and 98% for images without noise. Coulibaly et al. [19] attained an accuracy of 95% in diagnosing mildew from pearl millet. Jadhav et al. [24] proposed an AlexNet which attained 98.75% accuracy, whereas GoogleNet attained 96.25% accuracy, for identifying diseases from soybean leaves.
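Zhang et al. [17] specifically report using global average pooling in the classification stage. The hedged Keras fragment below illustrates only that design choice; the layer widths and class count are assumptions for illustration, not the authors' architecture.

```python
# Illustration of a global-average-pooling (GAP) classification head, the
# design choice reported by Zhang et al. [17]; widths and class count are assumed.
import tensorflow as tf

def gap_head_cnn(num_classes=6, img_size=(128, 128)):
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=img_size + (3,)),
        tf.keras.layers.Conv2D(32, 3, activation="relu", padding="same"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu", padding="same"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(128, 3, activation="relu", padding="same"),
        # GAP collapses each feature map to a single value, replacing the
        # Flatten + large Dense layers and sharply reducing parameters,
        # which can help curb overfitting on small leaf datasets.
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
```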
Sun et al. [26] proposed a discount momentum deep learning optimizer which attained an accuracy of 97% for detecting 26 classes of diseases. Pham et al. [51] suggested a model which achieved a testing accuracy of 85.45% for identifying diseases from mango leaves. Shrestha et al. [22] obtained an accuracy of 88.8% using the proposed CNN model for diagnosing diseases in tomato, potato, and bell pepper leaves. The research of Sujatha et al. [27] revealed that, when it comes to identifying diseases in citrus plants, DL models showed superior performance to ML models. Different ML models, such as SVM, stochastic gradient descent, and random forest, achieved accuracies of 87%, 86.5%, and 76.8%, respectively. In contrast, three DL models, namely, Inception-v3, VGG-16, and VGG-19, provided disease detection accuracies of 89%, 89.5%, and 87.4%, respectively, for the same species. Bedi et al. [73] suggested an approach based on a CNN and convolutional autoencoders which attained an accuracy of 98.38%. Vallabhajosyula et al. [56] proposed a deep ensemble neural network technique which obtained 99.99% accuracy when the performance was assessed using the PlantVillage dataset. The accuracies of the C-GAN model provided by Abbas et al. [28] for five classes, seven classes, and ten classes of tomato leaf images were 99.51%, 98.65%, and 97.11%, respectively.
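Bedi et al. [73] combine convolutional autoencoders with a CNN. The sketch below shows one common way to wire such a hybrid (train an autoencoder for reconstruction and attach a classification head to its encoder); the layer sizes and the two-class head are assumptions for illustration, not the authors' exact architecture.

```python
# Hedged sketch of an autoencoder + classifier hybrid, in the spirit of the
# approach attributed to Bedi et al. [73]; all layer sizes are assumptions.
import tensorflow as tf

IMG_SIZE = (64, 64)
inputs = tf.keras.Input(shape=IMG_SIZE + (3,))

# Encoder: compress the leaf image into a small feature map.
x = tf.keras.layers.Conv2D(16, 3, strides=2, padding="same", activation="relu")(inputs)
encoded = tf.keras.layers.Conv2D(8, 3, strides=2, padding="same", activation="relu")(x)

# Decoder: reconstruct the image (trained with a reconstruction loss).
x = tf.keras.layers.Conv2DTranspose(8, 3, strides=2, padding="same", activation="relu")(encoded)
x = tf.keras.layers.Conv2DTranspose(16, 3, strides=2, padding="same", activation="relu")(x)
decoded = tf.keras.layers.Conv2D(3, 3, padding="same", activation="sigmoid")(x)

autoencoder = tf.keras.Model(inputs, decoded)
autoencoder.compile(optimizer="adam", loss="mse")

# Classification head on top of the encoder output (encoder may be frozen
# after autoencoder pre-training, or fine-tuned jointly).
c = tf.keras.layers.GlobalAveragePooling2D()(encoded)
outputs = tf.keras.layers.Dense(2, activation="softmax")(c)   # e.g., healthy vs. diseased (assumed)
classifier = tf.keras.Model(inputs, outputs)
classifier.compile(optimizer="adam",
                   loss="sparse_categorical_crossentropy",
                   metrics=["accuracy"])
```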
Chen et al. [71] suggested a model which attained an accuracy of 93.9% for identifying 60 classes of diseases from 26 different plant species. Akshai et al. [31] proposed a DenseNet model which achieved 98.27% accuracy for diagnosing black rot, leaf blight, and esca from images of grape leaves. Kibriya et al. [32] proposed two models for identifying diseases from tomato leaves, namely, GoogleNet and VGG16. GoogleNet obtained an accuracy of 99.23%, whereas VGG-16 attained 98% accuracy. Malathy et al.
[1] proposed a CNN which obtained an accuracy of 97% for diagnosing diseases from images of apples. Ashwinkumar et al. [66] suggested an optimal mobile network-based CNN, which achieved an accuracy of 98.7% for detecting various diseases, namely, late blight, target spot, leaf mold, and early blight, from images of tomato leaves.

Observation 10
This observation is solely based on the discussion of RQ10. Three categories were used to classify the accuracy levels attained by the various reviewed studies. The accuracies achieved by various existing models are compared in Figure 14 in three classes: ≤85%, 86-90%, and >90%. It was found that 73% of the evaluated studies offered plant disease diagnosis accuracy of more than 90%, while 14% offered accuracies of between 86% and 90%. The percentage of the examined studies with accuracy levels of 85% or less was only 13%.
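The three-way binning behind this observation can be reproduced with a few lines of code; the accuracy list below is placeholder data, not the reviewed studies' actual values.

```python
# Sketch of the three-way accuracy binning behind Observation 10 / Figure 14.
# The list below is placeholder data, not values from the reviewed studies.
accuracies = [82.3, 87.8, 93.4, 97.0, 99.5, 86.0, 70.0, 98.7]

bins = {"<=85%": 0, "86-90%": 0, ">90%": 0}
for acc in accuracies:
    if acc <= 85:
        bins["<=85%"] += 1
    elif acc <= 90:
        bins["86-90%"] += 1
    else:
        bins[">90%"] += 1

total = len(accuracies)
for label, count in bins.items():
    print(f"{label}: {count} studies ({100 * count / total:.0f}%)")
```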

Challenges in Existing Approaches
These discussions were solely based on the literature that was reviewed for plant diseases; the conclusions might be different for applications of image processing, ML, and DL in other fields.

•	The analysis of disease classification can be impacted by environmental factors such as temperature and humidity;
•	It is difficult to separate healthy and unhealthy portions of leaves because disease symptoms are not well defined;
•	Some models were unable to identify a certain stage of a plant leaf disease;
•	Some models failed to extract the desired impacted area from images with intricate backgrounds;
•	Several of the methods discussed in this review were trained using the publicly available PlantVillage dataset but fell short when tested in a real-world environment.

Overall Observation and Comparison
This section involves overall observation and comparison. The overall observation was framed on the basis of Observations 1 to 10, as shown in Figure 15. The comparison section involves a comparison of various parameters, as shown in Figure 16.


Overall Observation
The majority of the reviewed studies obtained image data from publicly available datasets, as is evident from Observation 1. Secondly, Observation 2 indicates that resizing was utilized for pre-processing the images, whereas Observation 3 reflects that the size of the dataset, i.e., the count of images, was most often increased using rotation operations during the data augmentation stage. Thirdly, Observation 4 indicates that GLCM was widely utilized during feature extraction, and Observation 5 reflects that texture features were extracted by most of the evaluated studies. The plant diseases were classified using CNNs in many of the publications that were reviewed, as demonstrated by Observation 6. In the majority of the studies that were analyzed, the quality of the images was improved during pre-processing, as shown by Observation 7, while Observation 8 reveals that data augmentation helped to decrease the overfitting of the models. Last but not least, Observation 10 demonstrates that the majority of the reviewed studies offered accuracy levels greater than 90%. Table 13 compares the various reviewed papers on the basis of the species evaluated, the techniques used for identification, the diseases identified, the performance measures, and their values.
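To make two of these recurring steps concrete, the scikit-image sketch below computes standard GLCM texture statistics (Observation 4) and generates rotated copies of a leaf image (Observation 3); the file name, distances, angles, and rotation angles are illustrative assumptions only.

```python
# Sketch of two steps highlighted above: GLCM texture features and
# rotation-based augmentation. Parameter choices are illustrative.
import numpy as np
from skimage import io, color, transform
from skimage.feature import graycomatrix, graycoprops

leaf = io.imread("leaf.jpg")                      # hypothetical input image
gray = (color.rgb2gray(leaf) * 255).astype(np.uint8)

# GLCM texture statistics commonly used as hand-crafted features.
glcm = graycomatrix(gray, distances=[1, 2], angles=[0, np.pi / 2],
                    levels=256, symmetric=True, normed=True)
features = [graycoprops(glcm, prop).mean()
            for prop in ("contrast", "homogeneity", "energy", "correlation")]

# Rotation augmentation: create rotated copies to enlarge the training set.
augmented = [transform.rotate(leaf, angle, resize=False) for angle in (90, 180, 270)]
print(len(features), len(augmented))
```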

Conclusions and Future Scope
Diverse available techniques using ML, DL, and image processing were surveyed in this research to determine their applicability to diagnosing illnesses in various plant species. From the agricultural literature examined for this effort, 75 pertinent articles were selected for this review. Attention was particularly paid to the data sources, pre-processing methods, feature extraction methods, data augmentation methods, utilized models, and general effectiveness of the proposed models. The results showed that most existing models have only a modest capacity to process original image data in its unstructured state. Separating the affected area of interest from the complicated background of an image required systematic engineering and expert design skills in the identification techniques based on different approaches.
This survey's objective was to encourage researchers to use various image processing, ML, and DL approaches for identifying and categorizing plant diseases. Most of the reviewed studies worked on images of single leaves for disease detection; in future work, multiple leaves in a single frame could be used. These images could be captured under diverse environmental conditions (temperature, humidity, etc.) to reduce the impact of such conditions on disease detection, and new approaches could be developed that indicate the stage of the disease. Moreover, in the future, plant disease detection approaches could be integrated with drones and mobile applications to detect diseases in their early stages in large agricultural fields.
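As a rough illustration of the mobile-deployment direction suggested above, the sketch below converts a trained Keras classifier to TensorFlow Lite, a format commonly used for on-device inference; the model file names are placeholders, and this is only one possible route rather than an approach taken by the reviewed studies.

```python
# Hedged sketch of one possible path to on-device (mobile/drone) inference:
# converting a trained Keras model to TensorFlow Lite. The file names are
# placeholders; "model" is assumed to be a trained tf.keras classifier.
import tensorflow as tf

model = tf.keras.models.load_model("leaf_disease_model.keras")   # hypothetical file

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]              # optional quantization
tflite_model = converter.convert()

with open("leaf_disease_model.tflite", "wb") as f:
    f.write(tflite_model)
```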
Author Contributions: All authors carried out the review of existing literature and searched for gaps in the existing work. All authors prepared questionnaires for conducting the review and helped to draft the manuscript. All authors have read and agreed to the published version of the manuscript.

Data Availability Statement: Data will be available on request.